Improved open-vocabulary spoken content retrieval with word and subword lattices using acoustic feature similarity
Authors
Hung-yi Lee a,∗, Po-wei Chou b, Lin-shan Lee a
a Graduate Institute of Communication Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
b Department of Electrical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
Similar resources
Open-Vocabulary Retrieval of Spoken Content with Shorter/Longer Queries Considering Word/Subword-based Acoustic Feature Similarity
Acoustic feature similarity between utterances has been shown to be very helpful for spoken term detection using pseudo-relevance feedback (PRF) and graph-based re-ranking. Both approaches rest on the idea that utterances acoustically similar to utterances with higher relevance scores should themselves receive higher scores, while graph-based re-ranking further considers the similarity struct...
Spoken Term Detection and Spoken Content Retrieval: Evaluations on NTCIR 11 SpokenQuery&Doc Task
In this paper, we report our experiments on the NTCIR-11 SpokenQuery&Doc task for spoken term detection (STD) and spoken content retrieval (SCR). In STD, we consider acoustic feature similarity between utterances over both word and sub-word lattices to deal with the general problem of open vocabulary retrieval with queries of variable length. In SCR, we modify term frequency using expected term fre...
Open-vocabulary spoken document retrieval based on new subword models and subword phonetic similarity
A new type of video retrieval system is proposed that identifies a target video section by searching for a word passage submitted as a quoted speech or text query. The proposed system has two unique characteristics. The first characteristic is that it is based on subword models such as phonemes, syllables, and morphemes so the system is able to deal with any type of query, including new words a...
Multilayer subword units for open-vocabulary spoken document retrieval
This paper describes the application of subword units in an effort to improve open-vocabulary spoken document retrieval performance in the case of highly corrupted recognition output. It presents the developed open-vocabulary spoken document retrieval system, including the newly proposed subphonetic segment unit and the combination of multilayer subword units. Our experiments on Japanese spoken...
An approach for efficient open vocabulary spoken term detection
A hybrid two-pass approach for facilitating fast and efficient open vocabulary spoken term detection (STD) is presented in this paper. A large vocabulary continuous speech recognition (LVCSR) system is deployed for producing word lattices from audio recordings. An index construction technique is used for facilitating very fast search of lattices for finding occurrences of both in vocabulary (IV...
Journal title:
- Computer Speech & Language
Volume 28
Publication date 2014